Translating conversational speech to standard linguistic form
Authors
Abstract
This paper describes the so-called ill-formed nature of spontaneous conversational speech as observed from the study of a 1500-hour corpus of recorded dialogue speech. We note that the structure is quite different from that of more formal speech or writing and propose a Statistical Machine Translation approach for mapping between the spoken and written forms of the language as if they were two entirely separate languages. We further posit that the particular nature of the spoken language is especially well suited for the display of affective states, inter-speaker relationships and discourse management information. In summary, both modes of communication appear to be particularly suited to their pragmatic function, neither is ill-formed, and it appears possible to map automatically between the two. This mapping has applications in speech technology for the processing of conversational speech.
Similar resources
Ethnomethodology and Conversational Analysis
In a speech community, people utilize their communicative competence which they have acquired from their society as part of their distinctive sociolinguistic identity. They negotiate and share meanings, because they have commonsense knowledge about the world, and have universal practical reasoning. Their commonsense knowledge is embodied in their language. Thus, not only does social life depend...
Pronunciation variation in read and conversational Austrian German
This paper presents the first large-scale analysis of pronunciation variation in conversational Austrian German. Whereas for the varieties of German spoken in Germany, conversational speech has been given noticeable attention in the fields of linguistics and automatic speech recognition, for conversational Austrian there is a lack in speech resources and tools as well as linguistic and phonetic...
Towards a unified framework for sub-lexical and supra-lexical linguistic modeling
Conversational interfaces have received much attention as a promising natural communication channel between humans and computers. A typical conversational interface consists of three major systems: speech understanding, dialog management and spoken language generation. In such a conversational interface, speech recognition as the front-end of speech understanding remains to be one of the fundam...
Towards a Unified Framework
Conversational interfaces have received much attention as a promising natural communication channel between humans and computers. A typical conversational interface consists of three major systems: speech understanding, dialog management and spoken language generation. In such a conversational interface, speech recognition as the front-end of speech understanding remains to be one of the fundam...
Pronunciation variant analysis using speaking style parallel corpus
To improve the recognition accuracy for spontaneous conversational speech, we collected a corpus to study how spontaneous conversational speech differs from read style speech. The corpus consists of two parts: 1) spontaneous conversational speech and 2) read speech with the same word transcriptions as the conversational speech. In word and phone recognition experiments, it was confirmed that, f...